Empirical likelihood is a popular nonparametric or semi-parametric
statistical method with many nice statistical properties. Yet when the 
sample size is small, or the dimension of the accompanying estimating 
function is high, the application of the empirical likelihood method 
can be hindered by low precision of the chi-square approximation 
and by nonexistence of solutions to the estimating equations. In this 
paper, we show that the adjusted empirical likelihood is effective at 
addressing both problems. With a specific level of adjustment, the 
adjusted empirical likelihood achieves the high-order precision of the 
Bartlett correction, in addition to the advantage of a guaranteed
solution to the estimating equations. Simulation results indicate that the
confidence regions constructed by the adjusted empirical likelihood 
have coverage probabilities comparable to or substantially more
accurate than the original empirical likelihood enhanced by the Bartlett
correction. 

1. Introduction. In applications such as econometrics, statistical finance
and biostatistics, general estimating equations (GEE) in the form
$E\{g(X;\theta)\} = 0$, where $g(x;\theta)$ is a vector-valued function of the
observation vector $x$ and the parameter vector $\theta$, are often used to
define the parameters of interest [Hansen (1982), Liang and Zeger (1986),
Kitamura and Stutzer (1997) and Imbens, Spady and Johnson (1998)]. With a
semi-parametric setup, scientists run a low risk of mis-specifying a
probability model for the population under investigation. Particularly when
the parameter is over-identified, that is, when the dimension of $g$ is
larger than the dimension of $\theta$, the generalized moment method (GMM),
the empirical likelihood (EL) method or its variations can be used for
statistical inference [Hansen (1982), Owen (1988), Newey and McFadden (1994),
Qin and Lawless (1994), Imbens (1997), Smith (1997) and Newey and Smith
(2004)]. Many researchers, however, find that the finite sample properties
of the statistics based on GMM or EL are often very different from the
asymptotic properties at sample sizes common in applications [Hall and La
Scala (1990), DiCiccio, Hall and Romano (1991), Corcoran, Davison and Spady
(1995), Burnside and Eichenbaum (1996), Corcoran (1998) and Tsao (2004)].
High-order approximations to the finite sample distribution based on the
Bartlett correction or bootstrapping can be helpful [DiCiccio, Hall and
Romano (1991), Hall and Horowitz (1996), Brown and Newey (2002), Newey and
Smith (2004) and Chen and Cui (2007)]. Yet they do not always live up to
their promise, particularly for high-dimensional data [Corcoran, Davison and
Spady (1995) and Tsao (2004)].

We propose a novel approach via adjusted empirical likelihood (AEL)
[Chen, Variyath and Abraham (2008)] to achieve the high-order precision
promised by the Bartlett correction. The AEL is obtained by adding a
pseudo-observation into the data set. Its principal utility is to overcome
the difficulty arising when the estimating equations have no solution; a
solution is required in the EL approach. By using a conventional level of
adjustment, Chen, Variyath and Abraham (2008) found the AEL improves
the approximation precision of the chi-square limiting distribution. More
recently, Emerson and Owen (2009) discussed the level of adjustment for
inference on multivariate population mean. However, the optimal level of
adjustment remains unknown. In this paper, we derive a high-order expansion
of the adjusted empirical likelihood ratio statistic, specify an optimal level
of adjustment that enables the high-order approximation, prove that the
resulting AEL shares the same high-order precision as the Bartlett corrected
EL (BEL) and construct a less biased estimator of the Bartlett correction
factor that effectively improves the approximation precision.

Although the AEL and the BEL have the same high-order precision, their
finite sample performances differ. Simulation studies show that the AEL has
better precision than the BEL in general, and especially under linear and
asset-pricing models. The AEL with conventional level of adjustment, AEL$_0$,
is found to have comparable precision to the AEL under many models
considered, but it lacks some generality. In particular, the AEL improves
over the AEL$_0$ under linear and asset-pricing models.

2. The EL and the Bartlett correction. 

2.1. The empirical likelihood. To convey the idea, suppose we have
$x_1, x_2, \ldots, x_n$ as a random sample from a nonparametric population
$F(x)$ such that $x \in \mathbb{R}^m$ with dimension $m$. Assume that the GEE
model is defined by

    $E g(X;\theta) = 0$

for a $q$-dimensional estimating function $g$ and a $p$-dimensional parameter
$\theta$. The profile empirical likelihood function of $\theta$ is defined as

(1) $L_n(\theta) = \sup\Bigl\{\prod_{i=1}^n p_i : p_i \ge 0,\ \sum_{i=1}^n p_i = 1,\ \sum_{i=1}^n p_i g(x_i;\theta) = 0\Bigr\}.$

The empirical log-likelihood ratio function is defined by
$R_n(\theta) = -2\log\{n^n L_n(\theta)\}$ [see Owen (2001) and Qin and Lawless
(1994)]. One celebrated property of the empirical likelihood is that under
some general conditions,

    $\mathrm{pr}\{R_n(\theta_0) \le x\} = \mathrm{pr}\{\chi^2_q \le x\} + O(n^{-1})$

as $n \to \infty$ where $\theta_0$ is the true parameter value. This property is
most convenient for the construction of confidence regions of $\theta$,

(2) $\{\theta : R_n(\theta) \le c(1-\alpha; q)\}$

with $c(1-\alpha; q)$ being the $(1-\alpha)$th quantile of the chi-square
distribution with $q$ degrees of freedom, and $1-\alpha$ being the pre-selected
confidence level. Such confidence regions are renowned for their data-driven
shape, and there is no need to estimate any scalar parameters. For other
results, such as when $\theta_0$ is replaced by its nonparametric maximum EL
estimate $\hat\theta$, we refer to Qin and Lawless (1994).

2.2. The Bartlett correction of the EL. The precision of the confidence
region constructed by (2) can be poor, particularly when the sample size is
small. To improve the precision of the coverage probability, we may calibrate
the distribution of $R_n(\theta_0)$ by bootstrapping or by high-order
approximations. We now review high-order approximation via the Bartlett
correction.

The Bartlett correction for a smooth function of means was first established
by DiCiccio, Hall and Romano (1991), and that for estimating equations by
Chen and Cui (2006, 2007). For ease of illustration, we consider the situation
where $p = q = 1$ and $g(x;\theta) = x - \theta$. Under this model, the parameter
$\theta$ is the population mean. The chi-square approximation has precision
$O(n^{-1})$ and the confidence interval of $\theta$ based on the chi-square
approximation may not have accurate coverage probabilities. The Bartlett
correction can improve the approximation precision to $O(n^{-2})$.

By the Lagrange method, when the solution to
$\sum_{i=1}^n p_i g(x_i;\theta) = 0$ exists, we have

    $R_n(\theta) = 2\sum_{i=1}^n \log\{1 + \lambda g(x_i;\theta)\}$

for a Lagrange multiplier $\lambda$ that is the solution to

(3) $\sum_{i=1}^n \frac{g(x_i;\theta)}{1 + \lambda g(x_i;\theta)} = 0.$
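The computation described by (3) can be sketched numerically for the scalar case $q = 1$. The sketch below is only an illustration under stated assumptions: the function names `el_log_ratio` and `el_ratio_for_mean` are hypothetical, and bisection is our own choice of root finder (the text does not prescribe one). It relies on the fact that the left-hand side of (3) is strictly decreasing in $\lambda$ on the interval where all weights stay positive.

```python
import math


def el_log_ratio(g, n_iter=200):
    """Empirical log-likelihood ratio R_n for scalar values g_i = g(x_i; theta).

    Returns math.inf when 0 is not strictly inside the range of the g_i,
    the convention used when the constraints in (1) have no solution.
    """
    g_min, g_max = min(g), max(g)
    if not (g_min < 0.0 < g_max):
        return math.inf
    # The multiplier must keep every weight positive: 1 + lam * g_i > 0.
    lo, hi = -1.0 / g_max, -1.0 / g_min

    def score(lam):
        # Left-hand side of equation (3).
        return sum(gi / (1.0 + lam * gi) for gi in g)

    # score() is strictly decreasing on (lo, hi), so bisection finds its root.
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        if score(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    lam = 0.5 * (lo + hi)
    return 2.0 * sum(math.log1p(lam * gi) for gi in g)


def el_ratio_for_mean(x, theta):
    """R_n(theta) for the population mean, where g(x; theta) = x - theta."""
    return el_log_ratio([xi - theta for xi in x])
```

For the two values $g = (-1, 2)$, equation (3) can be solved by hand ($\lambda = 1/4$), which gives $R_n = 2\log(9/8)$ and serves as a check on the sketch.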
Let $\alpha_r = E\{g(X;\theta)\}^r$ and
$A_r = n^{-1}\sum_{i=1}^n \{g(x_i;\theta)\}^r - \alpha_r$. Without loss of
generality, we assume that either $\alpha_2 = 1$ or we can replace $g(x;\theta)$
with $\alpha_2^{-1/2} g(x;\theta)$. Assuming that $\theta$ is the true parameter
value, we can write

    $\lambda = \lambda_1 + \lambda_2 + \lambda_3 + O_p(n^{-2})$

with

    $\lambda_1 = A_1, \qquad \lambda_2 = \alpha_3 A_1^2 - A_1 A_2,$

    $\lambda_3 = A_1 A_2^2 + A_1^2 A_3 + 2\alpha_3^2 A_1^3 - 3\alpha_3 A_1^2 A_2 - \alpha_4 A_1^3.$

Under some moment conditions, $\lambda_r = O_p(n^{-r/2})$ for $r = 1, 2, 3$.
Substituting these expansions into the expression for $R_n(\theta)$, we get

(4) $R_n(\theta) = n\{R_1 + R_2 + R_3\}^2 + O_p(n^{-3/2})$

with

    $R_1 = A_1,$

    $R_2 = \tfrac{1}{3}\alpha_3 A_1^2 - \tfrac{1}{2} A_1 A_2,$

    $R_3 = \tfrac{3}{8} A_1 A_2^2 + \tfrac{4}{9}\alpha_3^2 A_1^3 - \tfrac{5}{6}\alpha_3 A_1^2 A_2 + \tfrac{1}{3} A_1^2 A_3 - \tfrac{1}{4}\alpha_4 A_1^3.$

DiCiccio, Hall and Romano (1991) find that the cumulants of

    $n(1 - b/n)(R_1 + R_2 + R_3)^2$

match those of the $\chi^2_1$ distribution to the order of $n^{-3/2}$ when

(5) $b = \tfrac{1}{2}\alpha_4 - \tfrac{1}{3}\alpha_3^2.$

Furthermore, since $R_1 + R_2 + R_3$ is a smooth function of general sample
means, the result of Bhattacharya and Ghosh (1978) implies that

    $\mathrm{pr}\{n(1 - b/n)(R_1 + R_2 + R_3)^2 \le x\} = \mathrm{pr}\{\chi^2_1 \le x\} + O(n^{-2}).$

More details are in the Appendix.

In applications, the value $b$ must be replaced by some root-$n$ consistent
estimator, and, in theory, the replacement does not affect the high-order
asymptotic conclusion. Naturally, $b$ is often replaced by a moment estimate.
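A moment estimate of $b$ and the resulting corrected statistic can be sketched as follows for $q = 1$. This is a minimal illustration, not the authors' code: the function names are hypothetical, and the formula used is (5) rewritten without the standardization $\alpha_2 = 1$, that is, $\alpha_4/(2\alpha_2^2) - \alpha_3^2/(3\alpha_2^3)$, which follows from replacing $g$ by $\alpha_2^{-1/2} g$ as described above.

```python
def central_moment(g, r):
    """Sample central moment of order r: n^{-1} * sum (g_i - mean)^r."""
    n = len(g)
    mean = sum(g) / n
    return sum((gi - mean) ** r for gi in g) / n


def bartlett_factor_moment(g):
    """Plug-in (moment) estimate of the Bartlett correction factor for q = 1."""
    a2 = central_moment(g, 2)
    a3 = central_moment(g, 3)
    a4 = central_moment(g, 4)
    return a4 / (2.0 * a2 ** 2) - a3 ** 2 / (3.0 * a2 ** 3)


def bartlett_corrected(r_n, b, n):
    """Bartlett-corrected statistic (1 - b/n) * R_n, compared with chi-square."""
    return (1.0 - b / n) * r_n
```

The corrected value is then compared with the same chi-square quantile $c(1-\alpha; q)$ that calibrates the uncorrected $R_n(\theta)$.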

Another way to improve the finite sample performance is to use bootstrap
calibration, that is, to estimate the sample distribution of $R_n(\theta)$ via
a bootstrap resampling scheme [see, e.g., Hall and Horowitz (1996)]. There
are situations where the solution $p_i$'s to the constraints in (1) at
$\theta = \theta_0$ do not exist with nonnegligible probability. A convention
adopted in this situation is to define $R_n(\theta) = \infty$. However, if
$\mathrm{pr}\{R_n(\theta_0) = \infty\} > \alpha$, then
$\mathrm{pr}\{R_n(\theta_0) \le c\} < 1 - \alpha$ for any finite $c$.
Consequently, a bootstrap scheme can at most boost the coverage probability
to $1 - \mathrm{pr}\{R_n(\theta_0) = \infty\}$ which is still below the nominal
level $1 - \alpha$. This problem is clearly also shared by the Bartlett
correction [see also Tsao (2004)].
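The ceiling on coverage can be made concrete in a simple case chosen here purely for illustration (it is not an example from the paper): the mean of an Exp(1) population, where $\theta_0 = 1$ and $g(x;\theta) = x - \theta$. For $q = 1$ the constraints in (1) have a solution only if $\theta_0$ lies strictly between the sample minimum and maximum, so $R_n(\theta_0) = \infty$ exactly when all observations fall on one side of 1. The function names below are hypothetical.

```python
import math


def prob_el_undefined_exp1(n):
    """P(all n i.i.d. Exp(1) draws lie on the same side of the mean 1)."""
    p_below = 1.0 - math.exp(-1.0)            # P(X < 1)
    return p_below ** n + (1.0 - p_below) ** n


def coverage_ceiling(n):
    """Largest coverage any calibration of R_n(theta_0) can reach."""
    return 1.0 - prob_el_undefined_exp1(n)
```

Under these assumptions, `coverage_ceiling(5)` is about 0.892, so with $n = 5$ no choice of critical value gives a 95% interval; by $n = 20$ the ceiling is above 0.999 and the restriction is no longer binding.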
3. The AEL and the high-order approximation.

3.1. The adjusted empirical likelihood. For each given $\theta$, the likelihood
ratio function $R_n(\theta)$ is well defined only if the convex hull of

(6) $\{g(x_i;\theta) : i = 1, 2, \ldots, n\}$

contains the $q$-dimensional vector 0. When $n$ is not large, or when a good
candidate (vector) value of $\theta$ is not available, this convex hull often
fails to contain 0 [see, e.g., Chen, Variyath and Abraham (2008)]. Blindly
setting $L_n(\theta) = 0$ as suggested in the literature fails to provide
information on whether $\theta$ is grossly unfit to the data or is in fact only
slightly off an appropriate value. Let $g_i = g(x_i;\theta)$, $i = 1, \ldots, n$,
and

    $g_{n+1} = -a_n \bar g_n = -a_n n^{-1} \sum_{i=1}^n g_i$

for some $a_n > 0$. The adjusted (profile) empirical likelihood is defined as

    $L_n(\theta; a_n) = \sup\Bigl\{\prod_{i=1}^{n+1} p_i : p_i \ge 0,\ \sum_{i=1}^{n+1} p_i = 1,\ \sum_{i=1}^{n+1} p_i g_i = 0\Bigr\}$

and the adjusted empirical likelihood ratio function as

    $R_n(\theta; a_n) = -2\log\{(n+1)^{n+1} L_n(\theta; a_n)\}.$

Because $\bar g_n$ and $g_{n+1}$ are on opposite sides of 0, the AEL is always
well defined. Namely, its value is always nonzero. When $a_n = o_p(n^{2/3})$,
Chen, Variyath and Abraham (2008) showed that the first-order asymptotic
properties of the EL are retained by the AEL, and a conventional
$a_n = \max\{1, \log n/2\}$ was found useful in a number of examples. However,
an optimal choice of $a_n$ remains unsolved. We next recommend a specific
$a_n$ and show that the resulting AEL achieves the goal attained by the
Bartlett correction.
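The effect of the pseudo-observation can be sketched for scalar $g_i$. The code below is an illustration under assumptions: the function names are hypothetical, the multiplier is found by bisection (our choice), and the statistic is computed as $2\sum_{i=1}^{n+1}\log(1 + \lambda g_i)$, the Lagrange form of $R_n(\theta; a_n)$ applied to the $n + 1$ augmented values.

```python
import math


def conventional_level(n):
    """The conventional adjustment level max{1, log(n)/2} quoted in the text."""
    return max(1.0, math.log(n) / 2.0)


def adjusted_el_log_ratio(g, a_n, n_iter=200):
    """Adjusted EL ratio for scalar g_i with pseudo-value g_{n+1} = -a_n * mean(g)."""
    if a_n <= 0.0:
        raise ValueError("a_n must be positive")
    g_bar = sum(g) / len(g)
    h = list(g) + [-a_n * g_bar]            # the n + 1 augmented values
    h_min, h_max = min(h), max(h)
    if h_min == 0.0 and h_max == 0.0:
        return 0.0                          # degenerate data: every g_i is zero
    # Otherwise h always has values on both sides of 0.
    lo, hi = -1.0 / h_max, -1.0 / h_min     # keeps 1 + lam * h_i > 0
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        if sum(v / (1.0 + mid * v) for v in h) > 0.0:
            lo = mid
        else:
            hi = mid
    lam = 0.5 * (lo + hi)
    return 2.0 * sum(math.log1p(lam * v) for v in h)
```

With $g = (1, 2, 3)$ the ordinary EL is undefined because every $g_i$ is positive, whereas the adjusted version returns a finite positive value. For $g = (2, 2)$ and $a_n = 1$ the multiplier can be found by hand ($\lambda = 1/6$), giving $2\log(32/27)$.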

3.2. AEL with high-order precision. The level of adjustment at which the
AEL has high-order precision is $a_n = b/2$, where $b$ is the Bartlett
correction factor for the usual EL. This surprising result reveals an
intrinsic relationship between the AEL and the Bartlett correction. Indeed,
the proof of the following result is built on the Bartlett correction.

Theorem 1. Suppose that $x_1, x_2, \ldots, x_n$ is a random sample from an
$m$-variate nonparametric population $F(x)$. Assume that the GEE model is
defined by

    $E g(X;\theta) = 0,$

where $\theta$ is a $p$-dimensional parameter, $g(X;\theta)$ is a
$q$-dimensional estimating function, and its characteristic function
satisfies Cramér's condition,

    $\limsup_{\|t\| \to \infty} |E \exp\{i t^T g(X;\theta)\}| < 1.$

Assume also that $E\|g(X;\theta)\|^{18} < \infty$ and
$\mathrm{var}(g(X;\theta))$ is positive definite. Let $\theta_0$ be the true
parameter value and $a_n = a + O_p(n^{-1/2})$. Then

    $R_n(\theta_0; a_n) = n\{R_1 + R_2 + R_{3a}\}^T \{R_1 + R_2 + R_{3a}\} + O_p(n^{-3/2}),$

where $R_1$, $R_2$ and $R_{3a}$ will be given in (14) and (16). When $a = b/2$
where $b$ is the Bartlett correction factor for the usual empirical likelihood,

    $\mathrm{pr}\{n\{R_1 + R_2 + R_{3a}\}^T \{R_1 + R_2 + R_{3a}\} \le x\} = \mathrm{pr}(\chi^2_q \le x) + O(n^{-2}).$

Adding a pseudo-observation $g_{n+1}$ results in a slightly different $R_{3a}$
as compared to $R_3$ in Section 2.2. This explains the choice of the notation.

When $q = 1$, the Bartlett correction factor
$b = \alpha_4/2 - \alpha_3^2/3 > 0$ unless $g(X;\theta)$ degenerates. Hence, the
pseudo-observation obtained by setting $a_n = b/2$ or its suitable estimator
satisfies the condition $a_n > 0$ required by the AEL. When $q > 1$, it is
uncertain whether $b > 0$ or not. While Theorem 1 remains valid, there is a
small probability that the AEL is not defined when $b < 0$. We can easily
avoid this problem by adding two pseudo-observations. Let

    $L_n(\theta; a_{1n}, a_{2n}) = \sup\Bigl\{\prod_{i=1}^{n+2} p_i : p_i \ge 0,\ \sum_{i=1}^{n+2} p_i = 1,\ \sum_{i=1}^{n+2} p_i g_i = 0\Bigr\}$

and let the adjusted empirical likelihood ratio function be

    $R_n(\theta; a_{1n}, a_{2n}) = -2\log\{(n+2)^{n+2} L_n(\theta; a_{1n}, a_{2n})\}$

with $g_{n+1} = -a_{1n}\bar g_n$ and $g_{n+2} = a_{2n}\bar g_n$. When
$a_{1n} - a_{2n} = b/2$, so that the net adjustment matches the single
pseudo-observation case $a_n = b/2$, the result of Theorem 1 remains.

In general, the Bartlett correction factor $b$ can be written as the
difference of two positive values. This decomposition gives us natural
choices of $a_{1n}$ and $a_{2n}$ for multidimensional estimating functions. In
simulations, we added a single pseudo-observation when $q = 1$ and two
pseudo-observations when $q \ge 2$. We also recommend this practice in
applications. More detailed discussions about the Bartlett correction factor
$b$ are given in the next subsection.

When $q > p$ where the parameter is over-identified, it is more efficient to
construct confidence regions with

    $\Lambda_n(\theta; a_n) = R_n(\theta; a_n) - \inf_{\theta} R_n(\theta; a_n).$

When $a_n = 0$, Chen and Cui (2007) show that $\Lambda_n(\theta_0; 0)$ is also
Bartlett correctable. The result of Theorem 1 remains valid as follows.
Theorem 2. Assume the same conditions as in Theorem 1, and that
there exists a neighborhood of $\theta_0$, $N(\theta_0)$, and an integrable
function, $h(x)$, such that

    $\sup_{\theta \in N(\theta_0)} \|\partial^3 g(x;\theta)/\partial\theta^3\|^3 \le h(x).$

Then

    $\Lambda_n(\theta_0; a_n) = n\{R_1 + R_2 + R_{3a}\}^T \{R_1 + R_2 + R_{3a}\} + O_p(n^{-3/2})$

for some $R_1$, $R_2$ and $R_{3a}$, and there exists a Bartlett correction
factor $b$ such that when $a = b/2$,

    $\mathrm{pr}\{n\{R_1 + R_2 + R_{3a}\}^T \{R_1 + R_2 + R_{3a}\} \le x\} = \mathrm{pr}(\chi^2_p \le x) + O(n^{-2}).$

The expressions of the Bartlett correction factor $b$ and $R_j$, $j = 1, 2$, in
Theorem 2 are the same as in Chen and Cui (2007). When $a_n = 0$, $R_{3a}$ also
becomes their $R_3$. More details and a brief proof are given in the Appendix.

3.3. Estimation of the Bartlett correction factor b. We first consider the
estimation of $b$ in the case of Theorem 1. Even for the simplistic one-sample
problem, Bartlett-corrected ordinary EL confidence intervals for the
population mean often have lower than nominal coverage probabilities when
the Bartlett correction factor $b$ is replaced by its moment estimator. The
Bartlett-corrected EL intervals with theoretical $b$ are often much more
satisfactory. Our investigation reveals that the moment estimator of $b$
usually grossly underestimates $b$, particularly when $n$ is small, say
$n = 20, 30$. See the simulation results presented in the next section.

Let us first examine the case of $q = p = 1$ where the Bartlett correction
factor is given by

    $b = \frac{\alpha_4}{2\alpha_2^2} - \frac{\alpha_3^2}{3\alpha_2^3}.$

Note that we no longer assume $\alpha_2 = 1$. The moment estimators of
$\alpha_r$ are given by $\hat\alpha_r = n^{-1}\sum_{i=1}^n (g_i - \bar g_n)^r$.
Since $E\hat\alpha_2 = (n-1)\alpha_2/n$, we estimate $\alpha_2$ by
$\tilde\alpha_2 = n\hat\alpha_2/(n-1)$ to reduce bias. In summary, we use the
estimators given in the following table to construct a less-biased estimator
of $b$:

    Parameter       Estimator              Expression
    $\alpha_2$      $\tilde\alpha_2$       $n\hat\alpha_2/(n-1)$
    $\alpha_4$      $\tilde\alpha_4$       $(n\hat\alpha_4 - 6\hat\alpha_2^2)/(n-4)$
    $\alpha_2^3$    $\tilde\alpha_{222}$   $\tilde\alpha_2^3$
    $\alpha_2^2$    $\tilde\alpha_{22}$    $\tilde\alpha_2^2 - \hat\alpha_4/n$
    $\alpha_3$      $\tilde\alpha_3$       $n\hat\alpha_3/(n-3)$
    $\alpha_3^2$    $\tilde\alpha_{33}$    $\tilde\alpha_3^2 - (\hat\alpha_6 - \tilde\alpha_3^2)/n$
The above choices are motivated as follows. Since

    $E\hat\alpha_4 = \alpha_4 - \frac{4\alpha_4}{n} + \frac{6\alpha_2^2}{n} + O(n^{-2}),$

    $E\tilde\alpha_2^2 = \alpha_2^2 + \frac{\alpha_4}{n} + O(n^{-2}),$

    $E\hat\alpha_3 = \alpha_3 - \frac{3\alpha_3}{n} + O(n^{-2}),$

we estimate $\alpha_4$, $\alpha_2^2$ and $\alpha_3$ by
$\tilde\alpha_4 = (n\hat\alpha_4 - 6\hat\alpha_2^2)/(n-4)$,
$\tilde\alpha_{22} = \tilde\alpha_2^2 - \hat\alpha_4/n$ and
$\tilde\alpha_3 = n\hat\alpha_3/(n-3)$, respectively. The biases of
$\tilde\alpha_4$, $\tilde\alpha_{22}$ and $\tilde\alpha_3$ are of order
$O(n^{-2})$ compared to the $O(n^{-1})$ biases of the corresponding moment
estimators. Precise form of the $O(n^{-1})$ bias of $\hat\alpha_3^2$ is complex.
Hence, we aim to reduce rather than completely eliminate the $O(n^{-1})$ bias.
Since $\hat\alpha_3 \approx n^{-1}\sum_{i=1}^n g_i^3$, we have approximately
$E\hat\alpha_3 = \alpha_3$ and $E\hat\alpha_3^2 = \alpha_3^2 + \mathrm{var}(\hat\alpha_3)$,
and approximately $\mathrm{var}(\hat\alpha_3) = (\alpha_6 - \alpha_3^2)/n$.
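The bias-reduced estimators described above can be combined into a single estimate of $b$ for $q = p = 1$. The sketch below is an assumption-laden illustration: the function names are hypothetical, and the final step, plugging $\tilde\alpha_4$, $\tilde\alpha_{22}$, $\tilde\alpha_{33}$ and $\tilde\alpha_2^3$ into $\alpha_4/(2\alpha_2^2) - \alpha_3^2/(3\alpha_2^3)$, is our reading of how the pieces are assembled rather than a formula stated explicitly in the text.

```python
def central_moment(g, r):
    """Sample central moment of order r: n^{-1} * sum (g_i - mean)^r."""
    n = len(g)
    mean = sum(g) / n
    return sum((gi - mean) ** r for gi in g) / n


def less_biased_bartlett_factor(g):
    """Bias-reduced estimate of b for scalar g_i (requires n >= 5)."""
    n = len(g)
    if n < 5:
        raise ValueError("need at least 5 observations")
    a2_hat = central_moment(g, 2)
    a3_hat = central_moment(g, 3)
    a4_hat = central_moment(g, 4)
    a6_hat = central_moment(g, 6)
    a2_t = n * a2_hat / (n - 1)                       # estimates alpha_2
    a4_t = (n * a4_hat - 6.0 * a2_hat ** 2) / (n - 4)  # estimates alpha_4
    a22_t = a2_t ** 2 - a4_hat / n                    # estimates alpha_2^2
    a222_t = a2_t ** 3                                # estimates alpha_2^3
    a3_t = n * a3_hat / (n - 3)                       # estimates alpha_3
    a33_t = a3_t ** 2 - (a6_hat - a3_t ** 2) / n      # estimates alpha_3^2
    return a4_t / (2.0 * a22_t) - a33_t / (3.0 * a222_t)
```

Note that $\tilde\alpha_{33}$ can be negative in small samples (for symmetric data it equals $-\hat\alpha_6/n$), which pushes the estimate of $b$ upward rather than downward.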

When $q = p > 1$, the expression for $b$ is more complex. Let
$V(\theta) = \mathrm{var}\{g(X;\theta)\}$ be the covariance matrix. By eigenvalue
decomposition, we may write

    $V(\theta_0) = P\,\mathrm{diag}\{\xi_1, \ldots, \xi_q\}\,P^T$

such that $PP^T = I$ and $\xi_1, \ldots, \xi_q$ are eigenvalues of
$V(\theta_0)$. Furthermore, let $Y = P^T g(X;\theta_0)$, and for any positive
integers $(r, s, \ldots, t)$, define

(9) $\alpha^{rs\cdots t} = E\{Y^r Y^s \cdots Y^t\},$

where $Y^t$ is the $t$th component of vector $Y$.

It can be seen that after $g$ is transformed by multiplying $P$,
$\alpha^{rr} = \xi_r$ and $\alpha^{rs} = 0$ for $r \ne s$. The Bartlett correction
factor can then be written as

    $b = \frac{1}{q}\Bigl\{\sum_r \frac{\alpha^{rrrr}}{2(\alpha^{rr})^2} + \sum_{r \ne s} \frac{\alpha^{rrss}}{2\alpha^{rr}\alpha^{ss}}\Bigr\} - \frac{1}{q}\Bigl\{\sum_r \frac{(\alpha^{rrr})^2}{3(\alpha^{rr})^3} + \sum_{r \ne s} \frac{(\alpha^{rss})^2}{\alpha^{rr}(\alpha^{ss})^2} + 2\sum_{r<s<t} \frac{(\alpha^{rst})^2}{\alpha^{rr}\alpha^{ss}\alpha^{tt}}\Bigr\}$

    $\phantom{b} = \frac{1}{q}\sum_r \Bigl\{\frac{\alpha^{rrrr}}{2(\alpha^{rr})^2} - \frac{(\alpha^{rrr})^2}{3(\alpha^{rr})^3}\Bigr\} + \frac{1}{2q}\sum_{r \ne s} \Bigl\{\frac{\alpha^{rrss}}{\alpha^{rr}\alpha^{ss}} - \frac{(\alpha^{rss})^2}{\alpha^{rr}(\alpha^{ss})^2}\Bigr\} - \frac{1}{q}\Bigl\{\frac{1}{2}\sum_{r \ne s} \frac{(\alpha^{rss})^2}{\alpha^{rr}(\alpha^{ss})^2} + 2\sum_{r<s<t} \frac{(\alpha^{rst})^2}{\alpha^{rr}\alpha^{ss}\alpha^{tt}}\Bigr\}.$

Let

    $b_1 = \frac{1}{q}\sum_r \Bigl\{\frac{\alpha^{rrrr}}{2(\alpha^{rr})^2} - \frac{(\alpha^{rrr})^2}{3(\alpha^{rr})^3}\Bigr\} + \frac{1}{2q}\sum_{r \ne s} \Bigl\{\frac{\alpha^{rrss}}{\alpha^{rr}\alpha^{ss}} - \frac{(\alpha^{rss})^2}{\alpha^{rr}(\alpha^{ss})^2}\Bigr\},$

    $b_2 = \frac{1}{2q}\sum_{r \ne s} \frac{(\alpha^{rss})^2}{\alpha^{rr}(\alpha^{ss})^2} + \frac{2}{q}\sum_{r<s<t} \frac{(\alpha^{rst})^2}{\alpha^{rr}\alpha^{ss}\alpha^{tt}}.$

Clearly, both $b_1$ and $b_2$ are positive and $b = b_1 - b_2$. There can be
other ways to decompose $b$. We have chosen the above decomposition so that
both $b_1$ and $b_2$ are of moderate size.

Note that the Bartlett correction factor(s) depends on the unknown
$\theta_0$. In applications, we first compute a maximum adjusted empirical
likelihood estimate $\hat\theta$ at $a_n = \log n/2$, and use it as a tentative
replacement of $\theta_0$ for estimating $b$ or $b_1$ and $b_2$. We decompose
the sample variance of $g(x;\theta)$ at $\theta = \hat\theta$ to obtain the
orthogonal matrix $P$. We then obtain $Y_i = P^T g(x_i;\hat\theta)$ and define
the moment estimators as

(10) $\hat\alpha^{rs\cdots t} = n^{-1}\sum_{i=1}^n Y_i^r Y_i^s \cdots Y_i^t.$

To reduce the bias in the estimation of $b_1$ and $b_2$, we use the estimators
given in the following table:

    Parameter                              Estimator                    Expression
    $\alpha^{rr}$                          $\tilde\alpha^{rr}$          $n\hat\alpha^{rr}/(n-1)$
    $\alpha^{rrss}$                        $\tilde\alpha^{rrss}$        $\{n\hat\alpha^{rrss} - 2\hat\alpha^{rr}\hat\alpha^{ss} - 4I(r=s)\hat\alpha^{rr}\hat\alpha^{rr}\}/(n-4)$
    $\alpha^{rst}$                         $\tilde\alpha^{rst}$         $n\hat\alpha^{rst}/(n-3)$
    $\alpha^{rst}\alpha^{rst}$             $\tilde\alpha^{rst,rst}$     $\tilde\alpha^{rst}\tilde\alpha^{rst} - (\hat\alpha^{rrsstt} - \tilde\alpha^{rst}\tilde\alpha^{rst})/n$
    $\alpha^{rr}\alpha^{ss}$               $\tilde\alpha^{rr,ss}$       $\tilde\alpha^{rr}\tilde\alpha^{ss} - \hat\alpha^{rrss}/n$
    $\alpha^{rr}\alpha^{ss}\alpha^{tt}$    $\tilde\alpha^{rr,ss,tt}$    $\tilde\alpha^{rr}\tilde\alpha^{ss}\tilde\alpha^{tt}$

for all $1 \le r, s, t \le q$, and $I(r = s)$ is the indicator function. We
denote the resulting estimates as $\tilde b_1$ and $\tilde b_2$. For $q > 1$, we
add two pseudo-observations with $a_{1n} = \tilde b_1/2$ and
$a_{2n} = \tilde b_2/2$ in the simulations.

To examine the bias properties of the new estimator, we generated 10,000
sets of random samples from a number of selected univariate, bivariate and
trivariate distributions. The population distributions are not important at
this stage, and they will be specified in the simulation section. We computed
the Bartlett correction factors and their average estimates for constructing
confidence regions of the population mean. The outcomes are given in Tables
1 and 2. The moment estimators are denoted as $\hat b_n$ and the new
estimators as $\tilde b_n$. Clearly, the new estimators are much less biased
under the normal, exponential and chi-square distributions. Under mixture
distributions, $\tilde b_n$ overestimates $b$, but the resulting AEL confidence
intervals still have good coverage properties. We also examined the bias
properties under a number of linear models. The results are given in Table 3.
Again, $\tilde b_n$ is much less biased. The model specifications are relegated
to the simulation section.

Table 1
Bartlett correction factors and their average estimates for univariate population mean

    n     Estimate         N(0,1)    Exp(1)    0.2N_1 + 0.8N_2    $\chi^2_1$
          $b$              1.50      3.17      1.11               4.83
    20    $\hat b_n$       1.16      1.40      1.14               1.59
          $\tilde b_n$     1.57      3.19      2.08               5.56
    30    $\hat b_n$       1.26      1.66      1.15               1.96
          $\tilde b_n$     1.56      3.17      1.63               5.12

When $q > p$, we prefer $\Lambda_n(\theta; a_n)$ for constructing confidence
intervals as in Theorem 2. However, as indicated in Chen and Cui (2007), it is
impractical to estimate $b$ by the method of moments as it involves many
terms and high-order moments. In simulations, we used a robustified bootstrap
estimate of $b$ suggested by Chen and Cui (2007).

Table 2
Bartlett correction factors and their average estimates for multivariate (q = 2, 3) population mean

            n     Estimate         (a)      (b)      (c)      (d)
    q = 2         $b$              3.21     3.71     1.68     2.21
            20    $\hat b_n$       1.63     1.67     1.48     1.46
                  $\tilde b_n$     2.93     3.34     2.55     2.14
            30    $\hat b_n$       1.90     1.98     1.56     1.64
                  $\tilde b_n$     3.06     3.47     2.18     2.20
    q = 3         $b$              4.07     3.84     2.36     2.67
            30    $\hat b_n$       2.27     2.24     1.98     2.00
                  $\tilde b_n$     3.72     3.47     2.62     2.62
            50    $\hat b_n$       2.67     2.61     2.13     2.22
                  $\tilde b_n$     3.89     3.64     2.52     2.67

Table 3
Bartlett correction factors and their average estimates under linear regression models

                  N(0,1)                               Exp(1)
    n      $b$     $\hat b_n$   $\tilde b_n$     $b$     $\hat b_n$   $\tilde b_n$
    30     3.55    2.39         3.56             7.98    2.61         5.39
    50     3.53    2.74         3.61             7.92    3.35         6.16
    100    3.90    3.28         3.86             9.00    4.58         7.07

4. Applications.

4.1. Confidence regions for population mean. A classical problem is the
construction of confidence regions or testing a hypothesis about a specific
value of the population mean based on a set of $n$ independent and identically
distributed observations. Particularly for scalar observations, the standard
approach is to use the Studentized sample mean,

    $T_n(\theta) = \sqrt{n}(\bar x_n - \theta)/s,$

for both purposes where $\bar x_n$ is the sample mean, and $s^2$ is the sample
variance. When the population distribution is normal, $T_n(\theta)$ has a
t-distribution with $n - 1$ degrees of freedom. The confidence interval or
hypothesis test calibrated by the t-distribution is found to be accurate even
for nonnormal population distributions and for moderate sample size $n$. For
multivariate observations, the t-statistic is replaced by Hotelling's $T^2$
defined as

    $T^2(\theta) = n(\bar X_n - \theta)^T S_n^{-1} (\bar X_n - \theta)$

with $\bar X_n$ the vector sample mean and $S_n$ the sample covariance matrix.
When the observations have a $p$-dimensional multivariate normal distribution,
$(n - p)T^2(\theta)/\{p(n - 1)\}$ has an F-distribution with $p$ and $n - p$
degrees of freedom. The F-distribution often serves as a reference
distribution for both hypothesis tests and constructing confidence regions,
whether or not the normality assumption holds. Surprisingly, the
normal-theory-based confidence regions have reasonably accurate coverage
probabilities even when the sample sizes are small and the population
distributions deviate from the normal. Thus they serve as a good barometer to
gauge the performance of a new method.
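The two normal-theory statistics used as benchmarks in Section 4.1 can be sketched directly from their definitions. The code is a minimal illustration with hypothetical function names; the bivariate case is written out by hand so that no linear-algebra library is needed, and the reference quantiles of the t- and F-distributions are left to a statistics package.

```python
import math


def studentized_mean(x, theta):
    """T_n(theta) = sqrt(n) * (mean - theta) / s with the unbiased sample variance."""
    n = len(x)
    mean = sum(x) / n
    s2 = sum((xi - mean) ** 2 for xi in x) / (n - 1)
    return math.sqrt(n) * (mean - theta) / math.sqrt(s2)


def hotelling_t2_bivariate(xs, theta):
    """Hotelling's T^2 for bivariate observations xs = [(x1, x2), ...]."""
    n = len(xs)
    m1 = sum(p[0] for p in xs) / n
    m2 = sum(p[1] for p in xs) / n
    s11 = sum((p[0] - m1) ** 2 for p in xs) / (n - 1)
    s22 = sum((p[1] - m2) ** 2 for p in xs) / (n - 1)
    s12 = sum((p[0] - m1) * (p[1] - m2) for p in xs) / (n - 1)
    det = s11 * s22 - s12 * s12
    d1, d2 = m1 - theta[0], m2 - theta[1]
    # Quadratic form d^T S^{-1} d using the explicit 2 x 2 inverse.
    quad = (s22 * d1 * d1 - 2.0 * s12 * d1 * d2 + s11 * d2 * d2) / det
    return n * quad


def f_scaled(t2, n, p):
    """(n - p) T^2 / {p (n - 1)}, referred to the F(p, n - p) distribution."""
    return (n - p) * t2 / (p * (n - 1))
```

For the four points $(1,0), (0,1), (-1,0), (0,-1)$ and $\theta = (1,1)$, the sample covariance is $(2/3)I$, so $T^2 = 12$ and the F-scaled value is 4.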

The EL and AEL counterparts are obtained by letting $g(x;\theta) = x - \theta$.
For the sake of comparison, we use the same simulation set-ups as in DiCiccio,
Hall and Romano (1991). We investigate the coverage probabilities of 90%,
95% and 99% confidence intervals based on the following methods:

(1) Hotelling's $T^2$ (including the univariate case), $T^2$;

(2) The usual empirical likelihood, EL;

(3) Bartlett-corrected empirical likelihood with moment estimate $\hat b_n$, BEL;

(4) Adjusted empirical likelihood with moment estimate $\hat b_n$, AEL;

(5) Bartlett-corrected empirical likelihood with $\tilde b_n$, BEL*;

(6) Adjusted empirical likelihood with $\tilde b_n$, AEL*;

(7) Bartlett-corrected empirical likelihood with known $b$ value, BEL†;

(8) Adjusted empirical likelihood with known $b$ value, AEL†;

(9) Adjusted empirical likelihood with level of adjustment $a_n = \frac{1}{2}\log n$, AEL$_0$.

Table 4
Coverage probabilities for one-sample population mean

    Distribution       n    Level   T^2    EL     BEL    AEL    BEL*   AEL*   BEL†   AEL†   AEL_0
    N(0,1)             20   90      90.1   88.2   89.0   89.1   89.3   89.5   89.3   89.4   91.0
                            95      95.1   93.2   94.0   94.0   94.2   94.4   94.2   94.3   95.4
                            99      98.9   97.9   98.2   98.3   98.3   98.4   98.3   98.4   98.9
                       30   90      90.2   89.0   89.7   89.8   90.?   90.0   89.9   89.9   91.1
                            95      95.5   94.3   94.9   94.9   95.?   95.0   95.0   95.0   95.8
                            99      99.1   98.7   98.8   98.8   98.8   98.8   98.9   98.9   99.1
    Exp(1)             20   90      87.5   85.6   86.8   87.0   87.6   88.2   88.2   88.9   88.7
                            95      92.?   91.1   91.8   91.9   9?.?   92.8   9?.8   9?.5   93.4
                            99      96.6   96.7   97.0   97.1   97.2   97.4   97.5   98.0   97.9
                       30   90      87.6   86.7   87.7   87.8   88.2   88.5   88.6   88.9   89.0
                            95      92.8   92.3   92.9   93.0   93.3   93.6   93.7   93.9   94.0
                            99      97.1   97.6   97.9   97.9   98.0   98.0   98.2   98.3   98.4
    0.2N_1 + 0.8N_2    20   90      88.4   88.4   89.5   89.5   91.0   91.8   89.2   89.2   90.9
                            95      92.8   93.3   94.3   94.3   95.0   95.5   94.4   94.4   95.2
                            99      97.?   97.8   98.0   98.0   98.1   98.2   98.0   98.0   98.4
                       30   90      88.7   89.1   89.9   89.9   90.3   90.4   89.7   89.8   91.2
                            95      93.7   94.4   94.9   94.9   95.3   95.5   94.7   94.7   95.6
                            99      97.8   98.8   99.1   99.1   99.2   99.3   99.0   99.0   99.3
    $\chi^2_1$         20   90      84.8   83.7   85.0   85.2   86.1   87.3   87.2   89.2   86.7
                            95      89.2   89.3   90.4   90.5   91.3   92.0   92.2   93.8   91.7
                            99      94.4   95.4   96.0   96.0   96.4   96.8   96.9   98.5   96.9
                       30   90      85.9   85.4   86.5   86.7   87.7   88.2   88.2   88.9   87.8
                            95      90.2   91.1   91.9   91.9   92.4   92.7   93.0   93.6   92.8
                            99      95.2   96.5   96.8   96.8   97.0   97.2   97.3   97.7   97.3

    A "?" marks a digit that is illegible in the source.

We generated 10,000 samples from four distributions: (a) the standard
normal; (b) an exponential distribution with mean 1; (c) a normal mixture
$0.2N(5,1) + 0.8N(-1.25,1)$; and (d) the $\chi^2_1$ distribution. The results
are presented in Table 4 where $0.2N_1 + 0.8N_2$ denotes the normal mixture
distribution.
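For these four populations the theoretical Bartlett correction factor in the mean problem can be computed from the central moments via $b = \mu_4/(2\mu_2^2) - \mu_3^2/(3\mu_2^3)$, the $q = p = 1$ formula of Section 3.3. The sketch below is an illustration with hypothetical function names; the central moments of Exp(1) and $\chi^2_1$ are entered as known constants, and the mixture moments are derived from the raw moments of its unit-variance normal components.

```python
def bartlett_factor(mu2, mu3, mu4):
    """b = mu4 / (2 mu2^2) - mu3^2 / (3 mu2^3) from central moments."""
    return mu4 / (2.0 * mu2 ** 2) - mu3 ** 2 / (3.0 * mu2 ** 3)


def unit_normal_raw_moments(m):
    """Raw moments of orders 1..4 for N(m, 1)."""
    return (m, m ** 2 + 1.0, m ** 3 + 3.0 * m, m ** 4 + 6.0 * m ** 2 + 3.0)


def mixture_central_moments(weights, means):
    """Central moments (mu2, mu3, mu4) of a mixture of N(m_k, 1) components."""
    raw = [
        sum(w * unit_normal_raw_moments(m)[k] for w, m in zip(weights, means))
        for k in range(4)
    ]
    m1, m2, m3, m4 = raw
    mu2 = m2 - m1 ** 2
    mu3 = m3 - 3.0 * m1 * m2 + 2.0 * m1 ** 3
    mu4 = m4 - 4.0 * m1 * m3 + 6.0 * m1 ** 2 * m2 - 3.0 * m1 ** 4
    return mu2, mu3, mu4


b_normal = bartlett_factor(1.0, 0.0, 3.0)        # N(0,1)
b_exp = bartlett_factor(1.0, 2.0, 9.0)           # Exp(1)
b_mix = bartlett_factor(*mixture_central_moments([0.2, 0.8], [5.0, -1.25]))
b_chi2 = bartlett_factor(2.0, 8.0, 60.0)         # chi-square, 1 df
```

Rounded to two decimals these agree with the row of theoretical $b$ values reported for the univariate population mean (1.50, 3.17, 1.11 and 4.83).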

Under the normal model, $T^2$ is optimal, yet we find that the AEL* is as
good within simulation error. The accuracy of the AEL* is consistently
better than that of the BEL and BEL*. This is particularly true when the
population distribution is exponential or chi-square. Under the mixture model,
the AEL* has a slightly higher than nominal coverage probability. Finally,
we remark that under the chi-square distribution, all the methods still have
room for improvement when $n = 20$. Our simulation results on EL and BEL
are comparable to those reported in the literature.
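A coverage experiment of the kind summarized above can be sketched for the plain EL interval under the Exp(1) model. This is an illustration only: the function names are hypothetical, the replication count is far below the 10,000 used in the paper so the estimate is noisy, the multiplier is found by bisection (our choice), and the constant is the 0.95 quantile of the chi-square distribution with one degree of freedom.

```python
import math
import random

CHI2_1_95 = 3.841458820694124   # 0.95 quantile of chi-square with 1 df


def el_log_ratio(g, n_iter=60):
    """EL ratio for scalar g_i; math.inf when 0 is outside their range."""
    g_min, g_max = min(g), max(g)
    if not (g_min < 0.0 < g_max):
        return math.inf
    lo, hi = -1.0 / g_max, -1.0 / g_min
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        if sum(gi / (1.0 + mid * gi) for gi in g) > 0.0:
            lo = mid
        else:
            hi = mid
    lam = 0.5 * (lo + hi)
    return 2.0 * sum(math.log1p(lam * gi) for gi in g)


def el_coverage_exp1(n=20, reps=400, seed=0):
    """Fraction of simulated Exp(1) samples whose 95% EL region covers the mean 1."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        x = [rng.expovariate(1.0) for _ in range(n)]
        if el_log_ratio([xi - 1.0 for xi in x]) <= CHI2_1_95:
            hits += 1
    return hits / reps
```

Replacing the acceptance rule by a Bartlett-corrected or adjusted statistic would give the corresponding coverage estimates for the other methods.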

In the multivariate case, we conducted simulation experiments for $p = 2$
and $p = 3$. We used the following strategy to generate correlated trivariate
observations. We first generated a random observation $D$ from the uniform
distribution on the interval [1, 2]. Given $D$, we generated $X_1$, $X_2$ and
$X_3$ from the distributions specified as follows:

(a) $X_1 \sim N(0, D^2)$, $X_2 \sim \mathrm{Gamma}(D^{-1}, 1)$, $X_3 \sim \chi^2_D$;

(b) $X_1 \sim \mathrm{Gamma}(D, 1)$, $X_2 \sim \mathrm{Gamma}(D^{-1}, 1)$, $X_3 \sim \mathrm{Gamma}(4 - D, 1)$;

(c) $X_1 \sim 0.2N(5, D^2) + 0.8N(-1.25, D^{-2})$, $X_2 \sim 0.2N(5, D^{-2}) + 0.8N(-1.25, D^2)$, $X_3 \sim N(0, D^2)$;

(d) $X_1 \sim \mathrm{Poisson}(D)$, $X_2 \sim \mathrm{Poisson}(D^{-1})$, $X_3 \sim \mathrm{Poisson}(4 - D)$.

When $p = 2$, we used $X_1$ and $X_2$ in our simulation and generated 10,000
data sets with sample sizes $n = 20$ and 30. When $p = 3$, we also generated
10,000 data sets but increased sample sizes to $n = 30$ and 50 to accommodate
the higher dimension. Table 5 presents the simulation results.
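The generation strategy can be sketched for scheme (a). Several readings are assumptions on our part and are marked in the comments: the second argument of $N(0, D^2)$ is taken to be a variance, $\mathrm{Gamma}(D^{-1}, 1)$ is taken to have shape $D^{-1}$ and scale 1, and the third component is taken to be chi-square with $D$ degrees of freedom, generated as a gamma variable with shape $D/2$ and scale 2. The function names are hypothetical.

```python
import random


def draw_scheme_a(rng):
    """One correlated trivariate observation under scheme (a)."""
    d = rng.uniform(1.0, 2.0)                 # shared D ~ Uniform[1, 2]
    x1 = rng.gauss(0.0, d)                    # N(0, D^2): standard deviation D (assumed)
    x2 = rng.gammavariate(1.0 / d, 1.0)       # Gamma(shape 1/D, scale 1) (assumed)
    x3 = rng.gammavariate(d / 2.0, 2.0)       # chi-square with D df (assumed)
    return (x1, x2, x3)


def sample_scheme_a(n, seed=0):
    """A sample of n observations; each row draws its own D."""
    rng = random.Random(seed)
    return [draw_scheme_a(rng) for _ in range(n)]
```

The three components are dependent only through the shared $D$, which is what makes the observations correlated; for $p = 2$ one would keep the first two coordinates of each row.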

We observe that the AEL* outperforms all other methods, often
substantially. Under the bivariate mixture model (c) at nominal level 95% and
sample size $n = 20$, the AEL* has 93.7% coverage probability compared to
91.1% for the BEL and 92.3% for the BEL*. This is significant because the
AEL*, the BEL and the BEL* are known to be precise up to the same order
$n^{-2}$. The difference in performances presumably comes from higher orders.

We remark here that the above discussion has not taken AEL† and AEL$_0$
into account. The AEL† is only for theoretical interest and its performance
indicates how far AEL can be improved by choosing a better estimator of
$b$. The AEL$_0$ is the AEL with a conventional level of adjustment suggested
in Chen, Variyath and Abraham (2008). It has comparable performance to
AEL*. Due to a lack of theoretical justification, the observed good
performance is hard to generalize. We will continue to keep an eye on its
performance.

4.2. Linear regression. The empirical likelihood method can also be used
to construct confidence regions for the regression coefficient $\beta$ in the
following linear regression model:

(11) $y = x^T\beta + \varepsilon,$

where $\beta$ is a $p$-dimensional parameter, $x$ a $p$-dimensional fixed design
point and $y$ the scalar response. Chen (1993) showed that the empirical
likelihood confidence regions for $\beta$ are also Bartlett correctable. In
comparison, by letting $g(y, x; \beta) = x(y - x^T\beta)$ the proposed AEL
method (AEL*) directly applies.
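The estimating function for model (11) can be sketched for $p = 2$. The code is an illustration with hypothetical function names. It also shows a fact that follows directly from the normal equations: at the least-squares estimate the average of the $g_i$ is zero, so the Lagrange multiplier is zero and the EL ratio attains its minimum value 0 there.

```python
def estimating_function(y, x, beta):
    """g(y, x; beta) = x * (y - x^T beta) for a single observation."""
    resid = y - sum(xj * bj for xj, bj in zip(x, beta))
    return [xj * resid for xj in x]


def least_squares_2d(ys, xs):
    """Least-squares estimate for p = 2 via the 2 x 2 normal equations."""
    s11 = sum(x[0] * x[0] for x in xs)
    s12 = sum(x[0] * x[1] for x in xs)
    s22 = sum(x[1] * x[1] for x in xs)
    t1 = sum(x[0] * y for x, y in zip(xs, ys))
    t2 = sum(x[1] * y for x, y in zip(xs, ys))
    det = s11 * s22 - s12 * s12
    return [(s22 * t1 - s12 * t2) / det, (s11 * t2 - s12 * t1) / det]


def mean_estimating_function(ys, xs, beta):
    """Average of g(y_i, x_i; beta) over the sample, component by component."""
    n = len(ys)
    gs = [estimating_function(y, x, beta) for y, x in zip(ys, xs)]
    return [sum(g[j] for g in gs) / n for j in range(len(beta))]
```

For any other candidate $\beta$ the vectors $g_i$ are 2-dimensional, and the (adjusted) EL ratio is computed from them exactly as in the multivariate mean problem.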
Table 5
Coverage probabilities for one-sample multivariate (q = 2, 3) population mean

                   n    Level   T^2    EL     BEL    AEL    BEL*   AEL*   BEL†   AEL†   AEL_0
    q = 2   (a)    20   90      86.0   81.7   83.8   84.3   84.8   86.2   85.4   87.0   86.6
                        95      91.3   87.7   89.3   89.8   90.1   91.6   90.6   92.2   91.8
                        99      96.5   94.5   95.3   95.9   95.9   96.7   96.1   97.9   97.4
                   30   90      87.2   85.1   86.5   86.7   87.0   87.8   87.4   88.0   88.2
                        95      92.2   90.8   91.7   92.0   92.2   92.8   92.4   93.0   93.2
                        99      97.0   96.5   97.0   97.2   97.2   97.5   97.4   97.6   97.8
            (b)    20   90      84.5   80.8   82.7   83.4   84.2   86.2   84.9   87.2   85.6
                        95      89.5   86.8   88.4   89.1   89.6   91.1   90.0   92.5   91.1
                        99      95.4   93.6   94.5   95.0   94.9   96.2   95.3   98.5   96.7
                   30   90      85.9   84.5   86.0   86.3   86.8   87.6   87.1   88.0   87.6
                        95      90.7   90.4   91.6   91.8   92.2   92.7   92.5   93.1   92.9
                        99      96.1   96.3   96.8   97.0   97.1   97.4   97.3   97.8   97.6
            (c)    20   90      85.7   84.6   86.2   86.4   87.7   89.4   86.2   86.5   88.8
                        95      90.6   89.9   91.1   91.5   92.3   93.7   91.2   91.4   93.2
                        99      95.8   95.2   95.7   96.0   96.0   97.2   95.7   96.0   97.1
                   30   90      87.9   87.4   88.9   89.0   89.6   90.0   88.9   89.0   90.7
                        95      92.9   93.2   94.2   94.3   94.7   95.1   94.1   94.2   95.4
                        99      97.2   98.0   98.3   98.4   98.5   98.8   98.3   98.4   98.7
            (d)    20   90      88.5   84.2   85.9   86.2   86.6   87.4   86.8   87.6   89.0
                        95      93.3   90.2   91.3   91.6   91.8   92.5   91.8   92.4   93.7
                        99      97.6   95.8   96.2   96.5   96.4   97.0   96.5   97.0   98.1
                   30   90      88.4   86.4   87.6   87.7   87.9   88.2   88.0   88.3   89.8
                        95      93.6   92.3   93.0   93.1   93.3   93.5   93.3   93.5   94.2
                        99      98.0   97.2   97.6   97.7   97.8   97.9   97.8   97.9   98.4

In this simulation study, we examined the performance of the AEL*
method based on model (11) with $p = 2$, the true parameter value
$\beta_0 = (1,1)^T$, and errors $\varepsilon_i$ generated from either a normal
distribution or a centralized exponential distribution, as specified in
Table 6. The $n \times 2$ design matrix was taken from the first $n$ rows of
Table 1 of Chen (1993). The simulation results are also given in Table 6.
The improvement of the AEL* over the EL, BEL and BEL* is universal, and it is
substantial under the nonnormal model when the sample size is small: with
Exp(1) errors and $n = 30$, the AEL* covers 86.1% at the nominal 90% level,
against 79.6% for the EL, 81.9% for the BEL and 83.6% for the BEL*.
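
To make the computation behind these regression results concrete, the following Python sketch evaluates an adjusted empirical likelihood ratio for the estimating function $g(y, x; \beta) = x(y - x^T\beta)$. It is a minimal illustration under stated assumptions, not the authors' implementation: the names `el_ratio` and `ael_ratio`, the damped Newton solver and the simulated design are all hypothetical, and the pseudo-observation is taken to be $g_{n+1} = -a_n \bar g_n$, the usual AEL construction.

```python
import numpy as np

def el_ratio(g, max_iter=100, tol=1e-10):
    """Return (R, lam): R = 2 * sum_i log(1 + lam' g_i) at the maximising lam."""
    g = np.asarray(g, dtype=float)
    lam = np.zeros(g.shape[1])

    def half_ratio(l):
        w = 1.0 + g @ l
        return np.sum(np.log(w)) if np.all(w > 0) else -np.inf

    for _ in range(max_iter):
        w = 1.0 + g @ lam
        gw = g / w[:, None]
        grad = gw.sum(axis=0)                 # gradient of sum_i log(1 + lam' g_i)
        hess = -gw.T @ gw                     # Hessian, negative definite
        step = np.linalg.solve(hess, -grad)   # Newton ascent direction
        t = 1.0
        while half_ratio(lam + t * step) < half_ratio(lam) and t > 1e-12:
            t *= 0.5                          # halve until the objective does not decrease
        lam = lam + t * step
        if np.linalg.norm(t * step) < tol:
            break
    return 2.0 * half_ratio(lam), lam

def ael_ratio(g, a_n):
    """Adjusted EL ratio: append the pseudo-observation -a_n * mean(g), then call el_ratio."""
    g = np.asarray(g, dtype=float)
    g_adj = np.vstack([g, -a_n * g.mean(axis=0)])
    return el_ratio(g_adj)

# Hypothetical data: n = 30, beta_0 = (1, 1)^T, centred Exp(1) errors.
rng = np.random.default_rng(0)
n = 30
x = np.column_stack([np.ones(n), rng.normal(size=n)])
beta0 = np.array([1.0, 1.0])
y = x @ beta0 + rng.exponential(size=n) - 1.0
g = x * (y - x @ beta0)[:, None]              # rows are g(y_i, x_i; beta_0)
R_adj, lam_adj = ael_ratio(g, a_n=np.log(n) / 2)
```

Because the pseudo-observation places the origin inside the convex hull of the augmented points, the inner maximisation in `ael_ratio` always has a finite solution. A confidence region would then collect the values of $\beta$ whose adjusted ratio falls below a chi-square quantile.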

Table 5 (Continued)

q = 3
        n   Level   T^2    EL     BEL    AEL    BEL*   AEL*   BEL†   AEL†   AEL_0
(a)    30    90     85.2   81.5   83.8   84.6   84.9   86.5   85.3   87.2   86.0
             95     90.6   88.1   89.8   90.7   90.6   91.9   91.0   92.5   91.6
             99     96.2   94.8   95.6   96.4   96.1   97.1   96.2   97.9   97.0
       50    90     85.8   84.4   85.8   86.2   86.5   86.9   86.5   87.0   86.9
             95     91.2   90.7   91.9   92.2   92.2   92.7   92.4   92.8   92.7
             99     96.6   96.6   97.2   97.5   97.5   97.7   97.5   97.8   97.7
(b)    30    90     85.3   81.4   83.6   84.4   84.8   86.1   85.2   86.7   86.0
             95     90.8   87.8   89.7   90.3   90.4   91.7   90.8   92.3   91.6
             99     96.4   95.1   95.9   96.5   96.3   97.1   96.4   97.6   97.2
       50    90     86.7   85.7   87.1   87.4   87.6   88.0   87.7   88.1   88.1
             95     92.0   91.1   92.2   92.5   92.6   92.8   92.8   93.1   93.1
             99     97.1   97.5   97.8   97.9   97.9   98.0   98.0   98.1   98.2
(c)    30    90     88.0   84.7   86.7   87.0   87.2   88.0   86.9   87.4   88.8
             95     93.0   90.5   91.9   92.3   92.4   93.1   92.1   92.5   93.7
             99     97.6   96.5   97.0   97.3   97.2   98.0   97.1   97.3   98.1
       50    90     88.7   87.4   88.7   88.8   89.0   89.1   88.8   88.9   90.0
             95     93.5   93.2   94.1   94.2   94.3   94.4   94.2   94.2   94.9
             99     98.2   98.3   98.6   98.6   98.6   98.7   98.6   98.6   98.9
(d)    30    90     88.4   84.2   86.1   86.6   86.7   87.3   86.7   87.3   88.4
             95     93.7   90.5   91.9   92.3   92.3   93.0   92.4   93.0   93.7
             99     98.1   96.4   97.2   97.4   97.3   97.7   97.3   97.7   98.3
       50    90     88.7   86.8   88.2   88.3   88.4   88.5   88.4   88.5   89.4
             95     94.0   92.9   93.7   93.8   93.8   93.9   93.8   93.9   94.4
             99     98.4   97.9   98.2   98.3   98.3   98.3   98.3   98.3   98.6

4.3. An example where q > p. In this subsection, we examine the AEL
through an asset-pricing model investigated by Hall and Horowitz (1996)
and also by Imbens, Spady and Johnson (1998), expanded with $q$ ($q \ge 2$)
moment restrictions by Schennach (2007). The parameter of interest is
defined through the following estimating equations:

\[
E\,g(X;\theta) = E
\begin{pmatrix}
r(X,\theta) \\
X_2\, r(X,\theta) \\
(X_3 - 1)\, r(X,\theta) \\
\vdots \\
(X_q - 1)\, r(X,\theta)
\end{pmatrix} = 0,
\tag{12}
\]

where $r(X,\theta) = \exp\{-0.72 - \theta(X_1 + X_2) + 3X_2\} - 1$,
$X = (X_1, X_2, \ldots, X_q)^T$ and $\theta$ is a scalar parameter. The components
of $X$ are mutually independent, with $X_1, X_2 \sim N(0, 0.16)$ and
$X_3, \ldots, X_q \sim \chi^2_1$. We generated data from the models with
$\theta_0 = 3$ and with $q = 2$ and $q = 3$, respectively.
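
As a quick sanity check on this model, the Python sketch below verifies that the moment functions have mean zero at $\theta_0 = 3$. It is illustrative only: the helper names `r`, `g_asset` and `simulate_x` are hypothetical, $N(0, 0.16)$ is read as variance 0.16, and $X_3, \ldots, X_q$ are assumed to be chi-square with one degree of freedom, which is what the $(X_k - 1)$ centering suggests.

```python
import math
import numpy as np

def r(x, theta):
    """r(X, theta) = exp{-0.72 - theta (X_1 + X_2) + 3 X_2} - 1, applied row-wise."""
    return np.exp(-0.72 - theta * (x[:, 0] + x[:, 1]) + 3.0 * x[:, 1]) - 1.0

def g_asset(x, theta):
    """Stack the q moment functions: r, X_2 r, (X_3 - 1) r, ..., (X_q - 1) r."""
    rv = r(x, theta)
    cols = [rv, x[:, 1] * rv] + [(x[:, k] - 1.0) * rv for k in range(2, x.shape[1])]
    return np.column_stack(cols)

def simulate_x(n, q, rng):
    """X_1, X_2 ~ N(0, 0.16); X_3..X_q ~ chi-square(1) (assumed, see the lead-in)."""
    parts = [rng.normal(0.0, 0.4, size=(n, 2))]      # sd 0.4 gives variance 0.16
    if q > 2:
        parts.append(rng.chisquare(1, size=(n, q - 2)))
    return np.hstack(parts)

# Exact check at theta_0 = 3: the X_2 terms cancel, and
# E exp(-0.72 - 3 X_1) = exp(-0.72 + 9 * 0.16 / 2) = 1, so E r(X, 3) = 0.
exact_mean_r = math.exp(-0.72 + 0.5 * 9 * 0.16) - 1.0

# Monte Carlo check of all q = 3 moment conditions.
rng = np.random.default_rng(1)
g_bar = g_asset(simulate_x(200_000, 3, rng), theta=3.0).mean(axis=0)
```

The constant $-0.72$ is exactly what centers the lognormal factor $\exp(-3X_1)$ at one, which is why $\theta_0 = 3$ solves the estimating equations.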

Table 6

Coverage probabilities (%) for the regression coefficient $\beta$

Errors      n   Level  F-test   EL     BEL    AEL    BEL*   AEL*   BEL†   AEL†   AEL_0
N(0, 1)    30    90     90.0   84.0   85.7   86.1   86.6   87.7   86.6   87.5   87.4
                 95     94.9   90.1   91.5   92.0   92.2   93.3   92.3   93.2   93.0
                 99     99.3   96.6   97.3   97.5   97.4   98.2   97.5   98.2   98.1
           50    90     89.7   86.9   88.4   88.5   88.7   88.9   88.7   89.0   89.2
                 95     95.0   92.7   93.6   93.7   93.8   94.0   93.8   94.0   94.2
                 99     99.0   97.7   98.1   98.2   98.2   98.2   98.2   98.3   98.4
          100    90     89.6   88.3   89.1   89.1   89.2   89.2   89.2   89.3   89.4
                 95     94.8   93.8   94.3   94.4   94.4   94.5   94.4   94.5   94.5
                 99     99.0   98.5   98.6   98.7   98.7   98.7   98.7   98.7   98.8
Exp(1)     30    90     87.9   79.6   81.9   82.4   83.6   86.1   85.7   92.6   83.5
                 95     92.8   86.4   88.2   88.8   89.4   91.6   91.0   98.5   89.6
                 99     97.7   93.7   94.7   95.2   95.3   97.0   96.3  100.0   95.9
           50    90     88.7   83.7   85.4   85.6   86.4   87.5   87.4   89.0   86.0
                 95     93.8   90.0   91.3   91.5   92.1   92.8   92.9   94.2   91.8
                 99     98.3   96.3   96.9   97.1   97.3   97.8   97.7   98.9   97.2
          100    90     88.9   86.2   87.3   87.3   87.8   88.1   88.4   88.8   87.4
                 95     94.2   92.2   93.0   93.0   93.3   93.6   93.8   94.2   93.1
                 99     98.5   97.8   98.1   98.1   98.2   98.3   98.3   98.5   98.1

Although Theorem 2 is applicable, precisely estimating $b$ is not easy due to
its complex expression. Instead, Chen and Cui (2007) proposed a bootstrap
estimate. We adopted their strategy with a robust modification. Let $\Delta_m$
be the sample median of $\Delta^*(\hat\theta; 0)$ based on $B = 300$ bootstrap
samples. We estimate $b$ by

\[ \hat b = n(\Delta_m/0.4549 - 1), \]

where 0.4549 is the median of the $\chi^2_1$ distribution. We generated samples of
sizes $n = 100$ and 200. The average bootstrap estimates of $b$ are 31 and 58
for $q = 2$ and $q = 3$ over 1000 repetitions. We call them off-line estimates
of $b$ and carried out the corresponding simulations side-by-side with the
bootstrap estimator $\hat b$ for each sample generated.
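
The median-based estimate can be sketched in a few lines of Python. The logic is that if $(1 - b/n)$ times the statistic is approximately $\chi^2_1$, then the median of the statistic is approximately $0.4549(1 + b/n)$, which is solved for $b$. The function name `bartlett_from_bootstrap` and its input `boot_stats` (a list of bootstrap replicates of the statistic) are hypothetical; how the replicates are generated is not shown here.

```python
from statistics import NormalDist, median

# Median of chi-square(1): if Z ~ N(0, 1), then P(Z^2 <= m) = 0.5 exactly when
# sqrt(m) is the 0.75 quantile of the standard normal distribution.
CHI2_1_MEDIAN = NormalDist().inv_cdf(0.75) ** 2   # approximately 0.4549

def bartlett_from_bootstrap(n, boot_stats):
    """b_hat = n * (median of bootstrap statistics / median of chi-square(1) - 1)."""
    return n * (median(boot_stats) / CHI2_1_MEDIAN - 1.0)
```

Using the median rather than the mean of the bootstrap replicates limits the influence of a few extreme replicates, which is the robust modification described above.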

In Table 7, we report the coverage probabilities of the nominal 90%,
95% and 99% confidence intervals of the empirical likelihood (EL), the
Bartlett corrected empirical likelihood (BEL), the adjusted empirical
likelihood [AEL($\hat b$)] and the adjusted empirical likelihood with the
conventional $a_n = \log(n)/2$ (AEL$_0$). Because $g$ is exponential in $\theta$
in this example, the sample mean $\bar g_n$ is unstable. For robustness, we
computed $g_{n+1}$ from a trimmed mean obtained by removing the five largest
$\|g_i\|$ values.
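
A minimal Python sketch of this trimming step follows. It assumes the pseudo-observation has the usual AEL form $g_{n+1} = -a_n \bar g_n$ with the ordinary mean replaced by the trimmed mean; the function name `trimmed_pseudo_observation` and its arguments are hypothetical.

```python
import numpy as np

def trimmed_pseudo_observation(g, a_n, n_trim=5):
    """Return -a_n times the mean of g after dropping the n_trim rows with largest norm."""
    g = np.asarray(g, dtype=float)
    norms = np.linalg.norm(g, axis=1)
    keep = np.argsort(norms)[: len(g) - n_trim]   # indices of the rows that survive trimming
    return -a_n * g[keep].mean(axis=0)
```

For instance, with the four scalar values 1, 2, 3 and 100, `a_n = 2` and `n_trim = 1`, the outlier 100 is dropped and the pseudo-observation is $-2 \times 2 = -4$.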

In terms of the precision of the coverage probabilities, the AEL is better
than the BEL, which is better than the EL and the AEL$_0$, and the latter
two have similar performances. Even after the robustification, the bootstrap
estimate of $b$ ranges from $-27$ to 376 when $n = 100$. This observation
indicates that neither the BEL nor the AEL is ready to be applied to models
similar to the one in this example. The simulation results have instead shown
the potential of the AEL approach. We hope to further investigate this
problem in the future.

Table 7

Simulation results under the expanded asset-pricing model

                          Bootstrapped b̂        Off-line b
   n    Level    EL       BEL    AEL(b̂)        BEL    AEL(b̂)     AEL_0

q = 2 (off-line b = 31)
  100    90     82.6     86.5     85.3         87.4     89.8       82.7
         95     88.4     91.2     92.6         92.8     95.4       88.8
         99     95.8     96.7     97.3         97.3     99.5       95.9
  200    90     83.9     86.6     85.1         87.8     87.2       84.3
         95     91.2     92.6     91.9         93.1     93.3       91.4
         99     96.9     97.4     97.6         97.8     98.2       96.9

q = 3 (off-line b = 58)
  100    90     78.4     84.9     84.1         87.4     90.5       79.8
         95     85.7     90.8     90.4         93.1     96.7       86.1
         99     94.0     96.1     97.9         97.7     99.8       94.0
  200    90     82.5     86.9     86.5         87.4     89.8       82.5
         95     89.7     92.7     92.9         93.3     95.3       89.8
         99     96.1     97.2     98.5         97.6     99.2       96.1

APPENDIX 

Proof of Theorem 1. We now present the proof for the general case
where $g(x; \theta)$ is vector valued.

In addition to the notation introduced earlier, we further define

\[ A^{rs\cdots t} = n^{-1}\sum_{i=1}^{n} g_i^r g_i^s \cdots g_i^t - \alpha^{rs\cdots t}, \]

where $g_i^r$ denotes the $r$th component of $g(x_i; \theta_0)$ and
$\alpha^{rs\cdots t}$ is defined in (10). Without loss of generality, we assume that
$\alpha^{rs} = I(r = s)$ at $\theta = \theta_0$. By DiCiccio, Hall and Romano (1991),
the solution to (3), before any adjustment, can be expanded as

\[ \lambda = \lambda_1 + \lambda_2 + \lambda_3 + O_p(n^{-2}) \]

with

\[ \lambda_1^r = A^r, \qquad \lambda_2^r = -A^{rs}A^s + \alpha^{rst}A^sA^t \]

and

\[ \lambda_3^r = A^{rs}A^{st}A^t + A^{rst}A^sA^t + 2\alpha^{rst}\alpha^{tuv}A^sA^uA^v
 - 3\alpha^{rst}A^{tu}A^sA^u - \alpha^{rstu}A^sA^tA^u. \]

Here we have used the summation convention according to which, if an
index occurs more than once in an expression, summation over the index is
understood. Substituting these expansions into the expression for $R_n(\theta_0)$,
we get

\[ R_n(\theta_0) = n\{R_1 + R_2 + R_3\}^T\{R_1 + R_2 + R_3\} + O_p(n^{-3/2}) \tag{13} \]

with

\[ R_1^r = A^r, \qquad R_2^r = \tfrac{1}{3}\alpha^{rst}A^sA^t - \tfrac{1}{2}A^{rs}A^s, \]

\[
\begin{aligned}
R_3^r ={}& \tfrac{3}{8}A^{rs}A^{st}A^t - \tfrac{5}{12}\alpha^{rst}A^{tu}A^sA^u - \tfrac{5}{12}\alpha^{stu}A^{rs}A^tA^u \\
 &+ \tfrac{4}{9}\alpha^{rst}\alpha^{tuv}A^sA^uA^v + \tfrac{1}{3}A^{rst}A^sA^t - \tfrac{1}{4}\alpha^{rstu}A^sA^tA^u.
\end{aligned}
\tag{14}
\]
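Recall that the usual Lagrange multiplier $\lambda$ solves $f(\lambda) = 0$, where

\[ f(\zeta) = n^{-1}\sum_{i=1}^{n} \frac{g(x_i;\theta_0)}{1 + \zeta^T g(x_i;\theta_0)}. \]

Now we work on the Lagrange multiplier $\lambda_a$ after an adjustment at level
$a_n = a + O_p(n^{-1/2})$. Since $\lambda_a = O_p(n^{-1/2})$, it must solve

\[ f(\lambda_a) - \frac{a}{n}\bar g = O_p(n^{-2}). \]

A Taylor expansion of $f(\lambda_a)$ gives

\[ f(\lambda_a) = f(\lambda) + \frac{\partial f(\lambda)}{\partial\lambda}(\lambda_a - \lambda) + O((\lambda_a - \lambda)^2). \]

Since $f(\lambda) = 0$, it simplifies to

\[ \frac{\partial f(\lambda)}{\partial\lambda}(\lambda_a - \lambda) = \frac{a}{n}\bar g + O_p(n^{-2}). \]

Note that

\[ \frac{\partial f(\lambda)}{\partial\lambda} = -E\{g(X;\theta_0)g^T(X;\theta_0)\} + O_p(n^{-1/2}) \]

and by assumption $E\{g(X;\theta_0)g^T(X;\theta_0)\} = I$; thus we arrive at

\[ \lambda_a = \lambda - n^{-1}a\bar g + O_p(n^{-2}) = (1 - n^{-1}a)\lambda + O_p(n^{-2}). \]

That is, the two Lagrange multipliers are nearly equal.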
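
The near-equality of the two multipliers can be checked numerically in the scalar case. The Python sketch below is only an illustration under assumptions: the bisection solver `el_multiplier`, the deterministic data and the choice $a = 1$ are hypothetical, and the pseudo-observation is taken to be $-a\bar g$.

```python
import numpy as np

def el_multiplier(g, iters=200):
    """Bisection root of sum_i g_i / (1 + lam * g_i) = 0 for scalar g_i of mixed sign."""
    g = np.asarray(g, dtype=float)
    lo, hi = -1.0 / g.max(), -1.0 / g.min()   # interval on which every 1 + lam * g_i > 0
    eps = 1e-12 * (hi - lo)
    lo, hi = lo + eps, hi - eps
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if np.sum(g / (1.0 + mid * g)) > 0:   # the score is decreasing in lam
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

n, a = 200, 1.0
g = np.linspace(-1.7, 1.7, n) + 0.05            # deterministic sample with mean 0.05
lam = el_multiplier(g)                          # multiplier before adjustment
lam_a = el_multiplier(np.append(g, -a * g.mean()))   # multiplier after adjustment
approx = (1.0 - a / n) * lam                    # first-order prediction (1 - a/n) * lam
```

In this example the gap between `lam_a` and `approx` is much smaller than the shift from `lam` to `lam_a` itself, in line with the $O_p(n^{-2})$ remainder.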

Next, we quantify the effect of slightly different Lagrange multipliers on
the expansion of $R_n(\theta_0; a_n)$. We have

\[
R_n(\theta_0; a_n) = 2\sum_{i=1}^{n}\log\{1 + (1 - n^{-1}a)\lambda^T g_i\}
 + 2\log\{1 - (1 - n^{-1}a)a\lambda^T\bar g\} + O_p(n^{-3/2}).
\]

Note that

\[ \log\{1 - (1 - n^{-1}a)a\lambda^T\bar g\} = -a\lambda^T\bar g + O_p(n^{-2}) \]

and, surprisingly,

\[ 2\sum_{i=1}^{n}\log\{1 + (1 - n^{-1}a)\lambda^T g_i\} = R_n(\theta_0) + O_p(n^{-3/2}). \]

Therefore, we must have

\[ R_n(\theta_0; a_n) = R_n(\theta_0) - 2aR_1^TR_1 + O_p(n^{-3/2}), \tag{15} \]

where $R_1$ is defined in (14), and, consequently,

\[ R_n(\theta_0; a_n) = n\{R_1 + R_2 + R_{3a}\}^T\{R_1 + R_2 + R_{3a}\} + O_p(n^{-3/2}) \]

with

\[ R_{3a} = R_3 - n^{-1}aR_1. \tag{16} \]
Denote

\[ Q_n = \sqrt{n}(R_1 + R_2 + R_{3a}), \]

\[ U_n = (A^1, \ldots, A^q, A^{11}, A^{12}, \ldots, A^{qq}, A^{111}, A^{112}, \ldots, A^{qqq})^T \]

such that the super-indices in $A^{rst}$ satisfy $1 \le r \le s \le t \le q$. Hence, $U_n$ has
$q(q + 1)(q + 2)/6$ components, and each component is a centralized sample
mean. Furthermore, $Q_n$ is a smooth vector-valued function of $U_n$. According
to Bhattacharya and Ghosh (1978), the Edgeworth expansion of a smooth
function of the sample mean (vector valued) is given by its formal Edgeworth
expansion based on its cumulants. Depending on the required order of the
expansion, the appropriate lower-order cumulants must exist.

In this theorem, we look for an expansion of the density function of $Q_n$
up to order $o(n^{-2})$. This expansion is determined by the first six cumulants
of $U_n$ and the derivative of $Q_n$ with respect to $U_n$. Note that we assumed
that the 18th moment of $g(x; \theta)$ exists and the highest order in $U_n$ is three,
hence all cumulants of $U_n$ up to order 6 exist. The cumulants of $Q_n$ can
then be obtained through those of $U_n$.

Let $\kappa_{r,s,\ldots,t}(Q_n)$ denote the joint cumulant of the $r$th, $s$th, ..., $t$th
components of $Q_n$. After some lengthy but routine algebraic work, we get

\[
\begin{aligned}
\kappa_r(Q_n) &= -n^{-1/2}\mu^r + n^{-3/2}c_1^r + o(n^{-2}), \\
\kappa_{r,s}(Q_n) &= I(r = s) + n^{-1}\gamma^{rs} + n^{-2}c_2^{rs} + o(n^{-2}), \\
\kappa_{r,s,t}(Q_n) &= n^{-3/2}c_3^{rst} + o(n^{-2}), \\
\kappa_{r,s,t,u}(Q_n) &= n^{-2}c_4^{rstu} + o(n^{-2}),
\end{aligned}
\]

where

\[ \mu^r = \tfrac{1}{6}\alpha^{rss}, \qquad
\gamma^{rs} = \tfrac{1}{2}\alpha^{rstt} - \tfrac{1}{3}\alpha^{rtu}\alpha^{stu} - \tfrac{1}{36}\alpha^{rst}\alpha^{tuu} - 2aI(r = s) \]

and $c_1^r$, $c_2^{rs}$, $c_3^{rst}$, $c_4^{rstu}$ are some nonrandom constants. Cumulants of orders
five and six are $o(n^{-2})$.

Let $f_{Q_n}(z)$ and $\phi(z)$ be the density functions of $Q_n$ and the $q$-variate
standard normal distribution. The key consequence of the above computation is
the resultant formal Edgeworth expansion,

\[ f_{Q_n}(z) = \Big\{1 + \sum_{j=1}^{4} n^{-j/2}\pi_j(z)\Big\}\phi(z) + o(n^{-2}) \]

with

\[ \pi_1(z) = -\mu^r z^r, \qquad
\pi_2(z) = \tfrac{1}{2}(\gamma^{rs} + \mu^r\mu^s)\{z^rz^s - I(r = s)\} \]

and for some polynomials $\pi_3(z)$ and $\pi_4(z)$ which are of order no more than
four, the former is odd and the latter is even. Their specific forms are not
needed further and so are omitted.
The above expansion implies that

\[ \mathrm{pr}\{Q_n^TQ_n \le x\} = \int_{z^Tz \le x}\Big\{1 + \sum_{j=1}^{4} n^{-j/2}\pi_j(z)\Big\}\phi(z)\,dz + o(n^{-2}). \]

Because $\pi_1(z)$ and $\pi_3(z)$ are odd functions, their integrals over the
symmetric region are zero. For the same reason, the integrals of the $z^rz^s$
terms in $\pi_2(z)$ with $r \ne s$ over a symmetric region are also zero. We further
note that the expression of $\gamma^{rs}$ involves $a$, and it is simple to get

\[ \int_{z^Tz \le x}\pi_2(z)\phi(z)\,dz = \tfrac{1}{2}(b - 2a)\int_{z^Tz \le x}(z^Tz - q)\phi(z)\,dz, \]

where

\[ b = \frac{1}{q}\Big(\tfrac{1}{2}\alpha^{rrss} - \tfrac{1}{3}\alpha^{rst}\alpha^{rst}\Big). \]

This $b$ is the Bartlett correction factor given in DiCiccio, Hall and Romano
(1991). Its expression is simpler than the earlier one because we assumed
$\alpha^{rs} = I(r = s)$. Hence, when $a = b/2$, we have

\[ \mathrm{pr}\{Q_n^TQ_n \le x\} = \int_{z^Tz \le x}\phi(z)\,dz + O(n^{-2}) = \mathrm{pr}(\chi^2_q \le x) + O(n^{-2}). \]

This completes the proof. □
The conclusion for $R_n(\theta_0; a_{1n}, a_{2n})$ is obtained similarly.
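
For readers who want to see the Bartlett factor as a computation, the Python sketch below evaluates a simple plug-in version of $b = q^{-1}(\tfrac{1}{2}\alpha^{rrss} - \tfrac{1}{3}\alpha^{rst}\alpha^{rst})$ from a sample of estimating-function values. It is an assumption-laden illustration rather than the estimator used in the simulations: the name `bartlett_factor` is hypothetical, the moments are replaced by raw sample moments, and $g$ is standardized with its uncentered second-moment matrix because $E\,g = 0$ at $\theta_0$.

```python
import numpy as np

def bartlett_factor(g):
    """Plug-in b = (1/q) * (alpha^{rrss} / 2 - alpha^{rst} alpha^{rst} / 3) after standardising g."""
    g = np.asarray(g, dtype=float)
    n, q = g.shape
    second = g.T @ g / n                           # sample version of E{g g^T}
    chol = np.linalg.cholesky(second)              # second = chol @ chol.T
    y = np.linalg.solve(chol, g.T).T               # standardised rows: mean(y y^T) = I
    a4 = np.mean(np.sum(y ** 2, axis=1) ** 2)      # alpha^{rrss}, summed over r and s
    a3 = np.einsum("ir,is,it->rst", y, y, y) / n   # third-moment array alpha^{rst}
    return (0.5 * a4 - np.sum(a3 ** 2) / 3.0) / q
```

The adjustment level suggested by the theorem would then be `bartlett_factor(g) / 2`. For the two-point sample $\{-1, 1\}$ the function returns 0.5, since the standardized fourth moment is 1 and the third moment vanishes.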



Proof of Theorem 2. Expanding $\Delta_n(\theta_0; a_n)$ and then computing its
cumulants are by far the most demanding parts of the proof of Theorem
2. The tasks are formidable. Fortunately, we find a short-cut by relating
$\Delta_n(\theta_0; a_n)$ to $\Delta_n(\theta_0; 0)$. By Chen and Cui (2007),

\[
\begin{aligned}
\Delta_n(\theta_0; 0) &= R_n(\theta_0; 0) - \inf_{\theta} R_n(\theta; 0) \\
 &= n\{R_1 + R_2 + R_3\}^T\{R_1 + R_2 + R_3\} + O_p(n^{-3/2})
\end{aligned}
\]

for some $R_1$, $R_2$ and $R_3$, some of which are different from those in DiCiccio,
Hall and Romano (1991). They have the same fundamental properties that
enable the Bartlett correction. In addition, $R_1$ equals the first $p$ components
of $n^{-1}\sum_{i=1}^{n} g(X_i; \theta_0)$ after $g$ is standardized in some way.
With some relatively routine algebra, we find

\[ R_n(\theta_0; a_n) = R_n(\theta_0; 0) - 2a\sum_{r=1}^{q}\Big\{n^{-1}\sum_{i=1}^{n} g^r(X_i; \theta_0)\Big\}^2 + O_p(n^{-3/2}) \]

and

\[ \inf_{\theta} R_n(\theta; a_n) = \inf_{\theta} R_n(\theta; 0) - 2a\sum_{r=p+1}^{q}\Big\{n^{-1}\sum_{i=1}^{n} g^r(X_i; \theta_0)\Big\}^2 + O_p(n^{-3/2}). \]

Hence,

\[
\begin{aligned}
\Delta_n(\theta_0; a_n) &= \Delta_n(\theta_0; 0) - 2a\sum_{r=1}^{p}\Big\{n^{-1}\sum_{i=1}^{n} g^r(X_i; \theta_0)\Big\}^2 + O_p(n^{-3/2}) \\
 &= n\{R_1 + R_2 + R_{3a}\}^T\{R_1 + R_2 + R_{3a}\} + O_p(n^{-3/2}),
\end{aligned}
\]

where

\[ R_{3a} = R_3 - \frac{a}{n}R_1. \]

This proves the first part of Theorem 2.

Again, according to Chen and Cui (2007), $R_1 + R_2 + R_3$ have cumulants
such that $(1 - b/n)\Delta_n(\theta_0; 0)$ is approximated by $\chi^2_p$ to $n^{-2}$ precision.
Taking advantage of their proof and using a similar derivation to the proof of
Theorem 1, we find that $\Delta_n(\theta_0; a_n)$ with $a_n = b/2 + O_p(n^{-1/2})$ is approximated
by $\chi^2_p$ to $n^{-2}$ precision. This completes the proof. □